Method for delivering nucleic acid into eukaryotic genomes



The present invention relates to genetic engineering and especially to the use of DNA 
transposition complex of bacteriophage Mu. In particular, the invention provides a gene 
transfer system for eukaryotic cells, wherein in vitro assembled Mu transposition 
complexes are introduced into a target cell. Inside the cell, the complexes readily mediate 
integration of a transposon construct into a cellular nucleic acid. The invention further 
provides a kit for producing insertional mutations into the genomes of eukaryotic cells. The 
kit can be used, e.g., to generate insertional mutant libraries. 

BACKGROUND OF THE INVENTION 

Efficient transfer of nucleic acid into a target cell is prerequisite for the success of almost 
any molecular biology application. The transfer of nucleic acid into various types of cells 
provides means to study gene function in living organisms, to express exogenous genes, or 
to regulate cell functions such as protein expression. Stably transferred inserts can also be 
used as primer binding sites in sequencing projects. In principle, the transfer can be 
classified as transient or stable. In the former case the transferred genetic material will 
eventually disappear from the target cells. Transient gene transfer typically utilizes plasmid 
constructions that do not replicate within the host cell. Because vector molecules that 
would replicate in mammalian cells are scarce, and in essence they are limited to those 
involving viral replicons (i.e. no plasmids available), the transient transfer strategy is in
many cases the only straightforward gene transfer strategy for mammalian cells. For other 
types of cells, e.g. bacterial and lower eukaryotes such as yeast, replicating plasmids are 
available and therefore transient expression needs to be used only in certain specific 
situations in which some benefits can be envisioned (e.g. conditional expression). 

In many cases stable gene transfer is the preferred option. For bacteria and lower 
eukaryotes plasmids that replicate within the cells are available. Accordingly, these DNA 
molecules can be used as gene delivery vehicles. However, the copy numbers of such 
plasmids typically exceed one or two, and therefore the transferred genes increase the gene
dosage substantially. Typically used plasmids for bacteria and yeasts are present in tens or
hundreds of copies. Increased gene dosage compared to the normal situation is a potential
source of artefactual or at least biased experimental results in many systems. Therefore, it
would be advantageous to generate situations in which single-copy gene transfer (per
haploid genome) would be possible. 

In general, stable single-copy gene transfer can be achieved if transferred DNA can be 
inserted into the target cell's chromosomal DNA. Traditionally, this has been achieved by 
using different types of recombination reactions. In bacteria, homologous recombination
and site-specific recombination are both widely used, and in some cases the less well
characterized "illegitimate" recombination may be used. The choice of a method typically
depends on whether a random or targeted mutation is required. While some of these 
methods are relatively trivial to use for a subset of the bacterial species, a general-purpose 
method would be more desirable. 

Recombination reactions may also be used to stably transfer DNA into a eukaryotic cell's
chromosomal DNA. Homologous and site-specific recombination reactions produce 
targeted integrations, and "illegitimate" recombination generates non-targeted events. 
Utilization of transpositional recombination has been described for baker's yeast 
Saccharomyces cerevisiae (Ji et al. 1993) and for fission yeast Schizosaccharomyces
pombe (Behrens et al. 2000). These strategies involve in vivo transposition in which the
transposon is launched from within the cell itself. They utilize suitably modified
transposons in combination with transposase proteins that are produced within a given cell.
Similar systems, in which transposase proteins are produced within cells, are also available
for other eukaryotic organisms; typical examples include Drosophila and zebra fish
(Rubin and Spradling 1982, Raz et al. 1997).

While transposition systems based on in vivo expression of the transposition machinery are 
relatively straightforward to use, they are not an optimal choice for gene transfer for various
reasons. For example, efficiency as well as the host-range may be limited, and target site 
selection may not be optimal. Viral systems, especially retroviral insertion methods, have 
been used to generate genomic insertions for animal cells. These strategies also have some 
disadvantageous properties. For example, immune response may be elicited as a response 
to virally encoded proteins, and in general, constructing safe and efficient virus vectors and
respective packaging cell lines for a given application is not necessarily a trivial task. 
Therefore, also for eukaryotic cells, a general-purpose random non-viral DNA insertion 
strategy would be desirable. Introduction of in vitro-assembled transposition complexes
into the cells may be a choice. It is likely that utilization of in vitro-assembled DNA
transposition complexes may be one of the most versatile systems for gene transfer.
Recently, such a system for bacterial cells has been described, and it utilizes chemical
reactions based on transpositional DNA recombination (US 6,159,736 and US 6,294,385).
Efficient systems are expected to provide a pool of mutants that can be used in various ways
to study many aspects of cellular life. These mutant pools are essential for studies
involving whole genomes (i.e. functional genomics studies). However, a priori it is not
possible to envision whether in vitro-assembled DNA transposition complexes would work
when introduced into eukaryotic cells, especially if the components are derived from
prokaryotes. The differences between prokaryotic and eukaryotic cells, especially the
presence of nuclear membrane and packaging of eukaryotic genomic DNA into chromatin 
structure, may prevent the prokaryotic systems from functioning. In addition, in view of 
the stability and catalytic activity of the transposition complex, conditions within 
eukaryotic cells may be substantially different from prokaryotic cells. In addition other 
unknown restriction system(s) may fight against incoming DNA and non-specific proteases 
may destroy assembled transposition complexes before they execute their function for 
integration. Furthermore, even if the transpositional reaction integrates the transposon into 
the genome, the ensuing 5-bp single-stranded regions (and in some cases 4-nt flanking 
DNA flaps) would need to be corrected by the host. Therefore, it is clear that the stability 
and efficiency of transposition complexes inside a eukaryotic cell cannot be predicted from
the results with bacterial cells as disclosed in US 6,159,736 and US 6,294,385. Thus, to
date there is no indication in the prior art that in vitro-assembled transposition complexes
can generally be used for nucleic acid transfer into the cells of higher organisms (i.e.
eukaryotes). 

Bacteriophage Mu replicates its genome using DNA transposition machinery and is one of 
the best characterized mobile genetic elements (Mizuuchi 1992; Chaconas et al., 1996). We
utilised for the present invention a bacteriophage Mu-derived in vitro transposition system
that has been introduced recently (Haapa et al. 1999a). The Mu transposition complex, the
machinery within which the chemical steps of transposition take place, is initially
assembled from four MuA transposase protein molecules that first bind to specific binding
sites in the transposon ends. The 50 bp Mu right end DNA segment contains two of these
binding sites (called R1 and R2, each 22 bp long; Savilahti et al. 1995). When two
transposon ends meet, each bound by two MuA monomers, a
transposition complex is formed through conformational change. Then Mu transposition
proceeds within the context of said transposition complex, i.e., protein-DNA complexes
that are also called DNA transposition complexes or transpososomes (Mizuuchi 1992;
Savilahti et al. 1995). The functional core of these complexes is assembled from a tetramer of
MuA transposase protein and Mu-transposon-derived DNA-end-segments (i.e. transposon
end sequences recognised by MuA) containing MuA binding sites. When the core
complexes are formed they can react in a divalent metal ion-dependent manner with any
target DNA and insert the Mu end segments into the target (Savilahti et al. 1995). A
hallmark of Mu transposition is the generation of a 5-bp target site duplication (Allet,
1979; Kahmann and Kamp, 1979).

In the simplest case, the MuA transposase protein and a short 50 bp Mu right-end (R-end)
fragment are the only macromolecular components required for transposition complex
assembly and function (Savilahti et al. 1995, Savilahti and Mizuuchi 1996). Analogously,
when two R-end sequences are located as inverted terminal repeats in a longer DNA
molecule, transposition complexes form by synapsing the transposon ends. Target DNA in
the Mu DNA in vitro transposition reaction can be linear, open circular, or supercoiled
(Haapa et al. 1999a).

To date, Mu in vitro transposition-based strategies have been utilized efficiently for a
variety of molecular biology applications including DNA sequencing (Haapa et al. 1999a;
Butterfield et al. 2002), generation of DNA constructions for gene targeting (Vilen et al.,
2001), and functional analysis of plasmid and viral (HIV) genomic DNA regions (Haapa et
al., 1999b, Laurent et al., 2000). Also, functional genomics studies on whole virus
genomes of potato virus A and bacteriophage PRD1 have been conducted using the Mu in
vitro transposition-based approaches (Kekarainen et al., 2002, Vilen et al., 2003). In
addition, a pentapeptide insertion mutagenesis method has been described (Taira et al.,
1999). Recently, an insertional mutagenesis strategy for bacterial genomes has been
developed in which the in vitro assembled functional transpososomes were delivered into
various bacterial cells by electroporation (Lamberg et al., 2002).

E. coli is the natural host of bacteriophage Mu. It was first shown with E. coli that in vitro
preassembled transposition complexes can be electroporated into the bacterial cells
whereby they then integrate the transposon construct into the genome (Lamberg et al.,
2002). The Mu transpososomes were also able to integrate transposons into the genomes of
three other Gram negative bacteria tested, namely, Salmonella enterica (previously known
as S. typhimurium), Erwinia carotovora, and Yersinia enterocolitica (Lamberg et al. 2002).
In each of these four bacterial species the integrated transposons were flanked by a 5-bp
target site duplication, a hallmark of Mu transposition, thus confirming that the integrations
were generated by DNA transposition chemistry.
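
The 5-bp target site duplication can be illustrated with a small string model. The following Python sketch is only a toy representation: the function names, the example target sequence and the placeholder transposon (a run of N) are hypothetical, and the model ignores the strand-transfer and gap-repair steps that produce the duplication in a real cell.

```python
# Toy model of a Mu-type integration: the 5 bases at the insertion site end up
# on both sides of the inserted element (target site duplication, TSD).
# All sequences below are made-up placeholders, not sequences from the source.

def integrate_with_tsd(target: str, transposon: str, pos: int, tsd_len: int = 5) -> str:
    """Insert `transposon` so that target[pos:pos+tsd_len] flanks it on both sides."""
    if pos < 0 or pos + tsd_len > len(target):
        raise ValueError("the staggered cut does not fit inside the target")
    tsd = target[pos:pos + tsd_len]
    return target[:pos + tsd_len] + transposon + tsd + target[pos + tsd_len:]


def flanking_duplication(product: str, transposon: str, tsd_len: int = 5):
    """Return the duplicated flank if both sides of the insert match, else None."""
    start = product.find(transposon)
    if start < tsd_len:
        return None  # insert not found, or too close to the left edge
    end = start + len(transposon)
    left = product[start - tsd_len:start]
    right = product[end:end + tsd_len]
    if len(right) == tsd_len and left == right:
        return left
    return None


if __name__ == "__main__":
    target = "ACGTTAGCCATAGTCA"      # hypothetical target DNA
    transposon = "N" * 10            # placeholder for the transposon body
    product = integrate_with_tsd(target, transposon, pos=4)
    print(product)
    print(flanking_duplication(product, transposon))
```

Checking sequenced flanks for such a matching 5-base repeat is the same test that the text uses as evidence of genuine transposition chemistry.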

SUMMARY OF THE INVENTION

We have developed a gene transfer system for eukaryotic cells that utilizes in vitro- 
assembled phage Mu DNA transposition complexes. Linear DNA molecules containing 
appropriate selectable markers and other genes of interest are generated that are flanked by 
DNA sequence elements needed for the binding of MuA transposase protein. Incubation of 
such DNA molecules with MuA protein results in the formation of DNA transposition 
complexes, transpososomes. These can be delivered into eukaryotic cells by 
electroporation or by other related methods. The method described in the present invention 
expands the applicability of the Mu transposon as a gene delivery vehicle into eukaryotes. 

In a first aspect, the invention provides a method for incorporating nucleic acid segments 
into cellular nucleic acid of a eukaryotic target cell, the method comprising the step of: 

delivering into the eukaryotic target cell a Mu transposition complex that comprises 
(i) MuA transposases and (ii) a transposon segment that comprises a pair of Mu end
sequences recognised and bound by MuA transposase and an insert sequence between said
Mu end sequences, under conditions that allow integration of the transposon segment into
the cellular nucleic acid.

In another aspect, the invention features a method for forming an insertion mutant library 
from a pool of eukaryotic target cells, the method comprising the steps of:

a) delivering into the eukaryotic target cell a Mu transposition complex that comprises
(i) MuA transposases and (ii) a transposon segment that comprises a pair of Mu end
sequences recognised and bound by MuA transposase and an insert sequence between said
Mu end sequences, under conditions that allow integration of the transposon segment into
the cellular nucleic acid.

b) screening for cells that comprise the selectable marker.



In a third aspect, the invention provides a kit for incorporating nucleic acid segments into 
cellular nucleic acid of a eukaryotic target cell.

The term "transposon", as used herein, refers to a nucleic acid segment, which is
recognised by a transposase or an integrase enzyme and which is an essential component of a
functional nucleic acid-protein complex capable of transposition (i.e. a transpososome).
The minimal nucleic acid-protein complex capable of transposition in the Mu system
comprises four MuA transposase protein molecules and a transposon with a pair of Mu end
sequences that are able to interact with MuA.

The term "transposase" used herein refers to an enzyme, which is an essential component
of a functional nucleic acid-protein complex capable of transposition and which
mediates transposition. The term "transposase" also refers to integrases from
retrotransposons or of retroviral origin. 

The expression "transposition" used herein refers to a reaction wherein a transposon inserts 
itself into a target nucleic acid. Essential components in a transposition reaction are a
transposon and a transposase or an integrase enzyme or some other components needed to
form a functional transposition complex. The gene delivery method and materials of the
present invention are established by employing the principles of in vitro Mu transposition
(Haapa et al. 1999a,b and Savilahti et al. 1995).

The term "transposon end sequence" used herein refers to the conserved nucleotide 
sequences at the distal ends of a transposon. The transposon end sequences are responsible 
for identifying the transposon for transposition. 

The term "transposon binding sequence" used herein refers to the conserved nucleotide
sequences within the transposon end sequence whereto a transposase specifically binds 
when mediating transposition.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1. Mini-Mu transposon integration into the yeast chromosomal or plasmid DNA in 
vivo by in vitro-assembled Mu transposition complexes comprising a tetramer of MuA
transposase and a mini-Mu transposon. 


Figure 2. Schematic representation of the Mu-transposons used in this study with the
relevant restriction sites. (A) Both of the yeast transposons contain TEF promoter (PTEF),
kan marker gene and TEF terminator (TTEF) embedded between two 50 bp Mu right end
sequences. The kanMX4-p15A-Mu transposon contains the additional p15A replicon.
Short arrows denote the binding sites of the primers used for sequencing of the out-cloned
flanking sequences. The BglII sites in the ends are used to excise the transposon from the
vector plasmid backbone. (B) The Mu/LoxP-Kan/Neo transposon for transfecting the
mouse ES cells. It contains kan/neo marker gene between two Mu right end and LoxP
sequences. The kan/neo marker includes the prokaryotic and eukaryotic promoters and
terminators as explained in Materials and methods.

Figure 3. Mu transposition complex formation with KanMX4-Mu (1.5 kb) and KanMX4-
p15A-Mu (2.3 kb) substrates analysed by agarose gel electrophoresis. Substrate DNA was
incubated with or without MuA, and the reaction products were analysed in the presence or
absence of SDS. Samples were electrophoresed on 2 % agarose gel containing 87 mg/ml
of heparin and 87 mg/ml of BSA. 

Figure 4. Southern blot analysis of the insertions into the yeast genome. Genomic DNA of
17 geneticin-resistant FY1679 clones, resulting from the electroporation of the
transposition complexes into yeast cells, was digested with BamHI + BglII (A) or HindIII
(B) and probed with kanMX4 DNA. Lanes 1-17, transposon insertion mutants; C, genomic
DNA of original S. cerevisiae FY1679 recipient strain as a negative control; P, linearized
plasmid DNA containing kanMX4-Mu transposon as a positive control; M, molecular size 
marker. The sizes of plasmid fragments are shown on the left. 


Figure 5. Distribution of kanMX4-Mu integration sites on yeast chromosomes (A) and 
in the repetitive rDNA region on chromosome 12 (B). The ovals in (A) designate the 
centromere of each chromosome. Integration sites in the diploid strain FY1679 are indicated
by bars, and the integration sites in the haploid strain FY-3 by bars with filled circles.
Above the line representing yeast genomic DNA are indicated the transposons that
contained the kan gene in the orientation of Watson strand, below the line the transposons 
are in the Crick strand orientation. 

DETAILED DESCRIPTION OF THE INVENTION

The in vitro assembled transposition complex is stable but catalytically inactive in 
conditions devoid of Mg2+ or other divalent cations (Savilahti et al., 1995; Savilahti and
Mizuuchi, 1996). After electroporation into bacterial cells, these complexes remain
functional and become activated for transposition chemistry upon encountering Mg2+ ions
within the cells, facilitating transposon integration into host chromosomal DNA (Lamberg
et al., 2002). The in vitro preassembled transpososomes do not need special host cofactors
for the integration step in vivo (Lamberg et al., 2002). Importantly, once introduced into
cells and integrated into the genome, the inserted DNA will remain stable in cells that do
not express MuA (Lamberg et al., 2002).


To study if the Mu transposition system with the in vitro assembled transpososomes works 
also for higher organisms, we constructed transposons (antibiotic resistance markers
connected to Mu ends), assembled the complexes and tested the transposition strategy and
target site selection after electroporation of yeast or mouse cells. The transposons were
integrated into the genomes with a 5-bp target site duplication flanking the insertion,
indicating that a genuine DNA transposition reaction had occurred. These results
demonstrate that, surprisingly, the conditions in eukaryotic cells allow the integration of
Mu DNA. Remarkably, the nuclear membrane, DNA binding proteins, or DNA
modifications or conformations did not prevent the integration. Furthermore, the structure
and catalytic activity of the Mu complex were retained even after repeated concentration
steps. This expands the applicability of the Mu transposition strategy into eukaryotes. The
benefit of this system is that there is no need to generate an expression system of the
transposition
machinery for the organism of interest. 

The invention provides a method for incorporating nucleic acid segments into cellular
nucleic acid of a eukaryotic target cell, the method comprising the step of:

delivering into the eukaryotic target cell a Mu transposition complex that comprises
(i) MuA transposases and (ii) a transposon segment that comprises a pair of Mu end
sequences recognised and bound by MuA transposase and an insert sequence between said
Mu end sequences, under conditions that allow integration of the transposon segment into 
the cellular nucleic acid.

For the method, one can assemble in vitro stable but catalytically inactive Mu transposition
complexes in conditions devoid of Mg2+ as disclosed in Savilahti et al., 1995 and Savilahti
and Mizuuchi, 1996. In principle, any standard physiological buffer not containing Mg2+
is suitable for the assembly of said inactive Mu transposition complexes. However, a
preferred in vitro transpososome assembly reaction may contain 150 mM Tris-HCl pH 6.0,
50 % (v/v) glycerol, 0.025 % (w/v) Triton X-100, 150 mM NaCl, 0.1 mM EDTA, 55 nM
transposon DNA fragment, and 245 nM MuA. The reaction volume may be for example 20
or 80 microliters. The reaction is incubated at about 30°C for 0.5 - 4 h, preferably 2 h. To
obtain a sufficient amount of transposition complexes for delivery into the cells, the
complexes from several assembly reactions are then pooled, concentrated and desalted. For
the yeast transformations the final concentration of transposition complexes compared to the
assembly reaction is preferably at least tenfold and for the mouse cell transfections at least
20-fold. The concentration step is preferably carried out by using centrifugal filter units.
Alternatively, it may be carried out by centrifugation or precipitation (e.g. using PEG or
other types of precipitants).
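
As a rough cross-check of the preferred assembly conditions, the following Python sketch converts the stated concentrations (55 nM transposon DNA, 245 nM MuA, 20 or 80 microliter reactions) into absolute amounts and compares the MuA supply with the four MuA molecules needed per transposon. The function names, the 1.5 kb default length and the average mass of 650 g/mol per base pair are illustrative assumptions and are not values prescribed by this description.

```python
# Illustrative stoichiometry check for the transpososome assembly reaction.
# Concentrations and volumes are those given in the text; AVG_BP_MASS is an
# assumed textbook approximation for double-stranded DNA.

AVG_BP_MASS = 650.0      # g/mol per base pair (assumption)
MUA_PER_COMPLEX = 4      # one MuA tetramer per transposon (two Mu ends)


def picomoles(conc_nM: float, volume_ul: float) -> float:
    """nM x microliters gives femtomoles; divide by 1000 for picomoles."""
    return conc_nM * volume_ul / 1000.0


def dna_mass_ng(pmol: float, length_bp: int) -> float:
    """pmol x (g/mol) gives picograms; divide by 1000 for nanograms."""
    return pmol * length_bp * AVG_BP_MASS / 1000.0


def assembly_summary(volume_ul: float, transposon_nM: float = 55.0,
                     mua_nM: float = 245.0, length_bp: int = 1500) -> dict:
    dna = picomoles(transposon_nM, volume_ul)
    mua = picomoles(mua_nM, volume_ul)
    return {
        "transposon_pmol": dna,
        "mua_pmol": mua,
        "mua_per_transposon": mua / dna,
        # Upper bound only: assumes every molecule ends up in a complex.
        "max_complexes_pmol": min(dna, mua / MUA_PER_COMPLEX),
        "transposon_ng": dna_mass_ng(dna, length_bp),
    }


if __name__ == "__main__":
    for volume in (20, 80):
        print(volume, assembly_summary(volume))
```

With these figures MuA is present at about 4.5 molecules per transposon, a slight excess over the four molecules of one tetramer.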

In the method, the concentrated transposition complex fraction is delivered into the
eukaryotic target cell. The preferred delivery method is electroporation. The 
electroporation of Mu transposition complexes into bacterial cells is disclosed in Lamberg 
et al., 2002. However, the method of Lamberg et al cannot be directly employed for 
introduction of the complexes into eukaryotic cells. As shown below in the Experimental 
Section, the procedure for electroporation of mouse embryonic stem (ES) cells described 
by Sands and Hasty (1997) can be used in the method of the invention. A variety of other 
DNA introduction methods are known for eukaryotic cells and the one skilled in the art can 
readily utilize these methods in order to carry out the method of the invention. Such DNA 
delivery methods include direct injections by the aid of needles or syringes, exploitation of 
liposomes, and utilization of various types of transfection-promoting additives. Physical 
methods such as particle bombardment may also be feasible.

Transposition into the cellular nucleic acid of the target cell seems to follow directly after
the electroporation without additional intervention. However, to promote transposition and 
remedy the stress caused by the electroporation, the cells can be incubated at about room 
temperature to 30°C for 10 min - 48 h or longer in a suitable medium before plating or 
other subsequent steps. Preferably, a single insertion into the cellular nucleic acid of the 
target cell is produced. 

The eukaryotic target cell of the method may be a human, animal, plant, fungal or yeast cell.
Preferably, the animal cell is a cell of a vertebrate such as mouse (Mus musculus), rat
(Rattus norvegicus), Xenopus, Fugu or zebra fish or an invertebrate such as Drosophila
melanogaster or Caenorhabditis elegans. The plant cell is preferably from Arabidopsis
thaliana, tobacco or rice. The yeast cell is preferably a cell of Saccharomyces cerevisiae or 
Schizosaccharomyces pombe. 

The insert sequence between Mu end sequences preferably comprises a selectable marker, 
gene or promoter trap or enhancer trap constructions, protein expressing or RNA producing 
sequences. Such constructs render possible the use of the method in gene tagging,
functional genomics or gene therapy. 

The term "selectable marker" above refers to a gene that, when carried by a transposon,
alters the ability of a cell harboring the transposon to grow or survive in a given growth 
environment relative to a similar cell lacking the selectable marker. The transposon nucleic 
acid of the invention preferably contains a positive selectable marker. A positive selectable 
marker, such as an antibiotic resistance, encodes a product that enables the host to grow 
and survive in the presence of an agent, which otherwise would inhibit the growth of the 
organism or kill it. The insert sequence may also contain a reporter gene, which can be any 
gene encoding a product whose expression is detectable and/or quantifiable by
immunological, chemical, biochemical, biological or mechanical assays. A reporter gene
product may, for example, have one of the following attributes: fluorescence (e.g., green
fluorescent protein), enzymatic activity (e.g., luciferase, lacZ/β-galactosidase), toxicity
(e.g., ricin) or an ability to be specifically bound by a second molecule (e.g., biotin). The
use of markers and reporter genes in eukaryotic cells is well-known in the art.

Since the target site selection of the in vitro Mu system is known to be random or nearly
random, one preferred embodiment of the invention is a method, wherein the nucleic acid
segment is incorporated to a random or almost random position of the cellular nucleic acid
of the target cell. However, targeting of the transposition can be advantageous in some
cases and thus another preferred embodiment of the invention is a method, wherein the
nucleic acid segment is incorporated to a targeted position of the cellular nucleic acid of
the target cell. This could be accomplished by adding to the transposition complex, or to
the DNA region between Mu ends in the transposon, a targeting signal on a nucleic acid or
protein level. Said targeting signal is preferably a nucleic acid, protein or peptide which is
known to efficiently bind to or associate with a certain nucleotide sequence, thus
facilitating targeting. 

One specific embodiment of the invention is the method wherein a modified MuA 
transposase is used. Such MuA transposase may be modified, e.g., by a deletion, an 
insertion or a point mutation and it may have different catalytic activities or specificities than
an unmodified MuA. 

Another embodiment of the invention is a method for forming an insertion mutant library 
from a pool of eukaryotic target cells, the method comprising the steps of: 

a) delivering into the eukaryotic target cell a Mu transposition complex that comprises
(i) MuA transposases and (ii) a transposon segment that comprises a pair of Mu end
sequences recognised and bound by MuA transposase and an insert sequence between said
Mu end sequences, under conditions that allow integration of the transposon segment into
the cellular nucleic acid.

b) screening for cells that comprise the selectable marker. 

In the above method, a person skilled in the art can easily utilise different screening 
techniques. The screening step can be performed, e.g., by methods involving sequence
analysis, nucleic acid hybridisation, primer extension or antibody binding. These methods
are well-known in the art (see, for example, Current Protocols in Molecular Biology, eds.
Ausubel et al., John Wiley & Sons: 1992). Libraries formed according to the method of
the invention can also be screened for genotypic or phenotypic changes after transposition.

A further embodiment of the invention is a kit for incorporating nucleic acid segments into
cellular nucleic acid of a eukaryotic target cell. The kit comprises a concentrated fraction
of Mu transposition complexes that comprise a transposon segment with a marker, which is
selectable in eukaryotic cells. Preferably, said complexes are provided as a substantially
pure preparation apart from other proteins, genetic material, and the like. 

The publications and other materials used herein to illuminate the background of the 
invention, and in particular, to provide additional details with respect to its practice, are 
incorporated herein by reference. The invention will be described in more detail in the
following Experimental Section. 

EXPERIMENTAL SECTION 
MATERIALS AND METHODS
Strains and media 

The Escherichia coli DH5α was used for bacterial transformations. The bacteria were grown
at 37 °C in LB broth or on LB agar plates. For the selection and maintenance of plasmids,
antibiotics were used at the following concentrations: ampicillin 100-150 µg/ml,
kanamycin 10-25 µg/ml, and chloramphenicol 10 µg/ml. The Saccharomyces cerevisiae
strain FY1679 (MATa/MATα ura3-52/ura3-52 his3Δ200/HIS3 leu2Δ1/LEU2
trp1Δ63/TRP1 GAL2/GAL2; Winston et al. 1995) and its haploid derivative FY-3 (MATa
HIS LEU TRP ura3-52) were used for yeast transformations. The yeasts were grown on
YPD (1 % yeast extract, 2 % peptone, 2 % glucose) or minimal medium (0.67 % yeast
nitrogen base, 2 % glucose). For the selection of the transformants, yeast cells were grown
on YPD plates containing 200 µg/ml of G418 (geneticin, Sigma).

The procedures required for propagating mouse AB2.2-Prime embryonic stem (ES) cells 
(Lexicon Genetics, Inc.) have been described by Sands and Hasty (1997). Briefly,
undifferentiated AB2.2-Prime ES cells were grown on 0.1 % gelatin (Sigma)-coated
tissue culture plates in the ES culture medium consisting of DMEM (Gibco)
supplemented with 15 % fetal bovine serum (Hyclone), 2 mM L-glutamine (Gibco), 1 mM
sodium pyruvate (Gibco), 100 µM β-mercaptoethanol and nonessential amino acids
(Gibco), 50 U/ml penicillin, 50 µg/ml streptomycin (Gibco), and 1000 U/ml LIF
(Chemicon). 

Proteins and reagents 
MuA transposase (MuA), proteinase K, calf intestinal alkaline phosphatase (CIP) and
CamR Entranceposon (TGS Template Generation System) were obtained from Finnzymes,
Espoo, Finland. Restriction endonucleases and the plasmid pUC19 were from New
England Biolabs. Klenow enzyme was from Promega. Enzymes were used as
recommended by the suppliers. Bovine serum albumin was from Sigma. [α-32P]dCTP
(1000-3000 Ci/mmol) was from Amersham Biosciences.

Construction of kanMX4-Mu transposons 

The kanMX4 selector module (1.4 kb) was released from the pFA6-kanMX4 (Wach et al. 
1994) by EcoRI + BglII double digestion and ligated to the 0.75 kb vector containing the
pUC miniorigin and the Mu ends, producing the kanMX4-Mu plasmid, pHTH1. Plasmid
DNA was isolated with the Plasmid Maxi Kit (QIAGEN). To confirm the absence of
mutations in the kanMX4 module the insert was sequenced following the in vitro
transposition reaction with the CamR Entranceposon as a donor DNA and the plasmid
pHTH1 as a target DNA with primers Muc1 and Muc2.


The primers for sequencing the yeast constructs were Muc1:
5'-GCTCTCCCCGTGGAGGTAAT-3' (SEQ ID NO:1) and Muc2:
5'-TTCCGTCACAGGTATTTATTCGGT-3' (SEQ ID NO:2). 
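
Basic properties of the two sequencing primers can be derived directly from the sequences given above. The Python sketch below reports length, GC fraction and reverse complement; the helper names are hypothetical and the output is a convenience check only, not part of the described procedure.

```python
# Simple descriptive statistics for the Muc1 and Muc2 primers.
# The sequences are copied from the text; the helper functions are illustrative.

PRIMERS = {
    "Muc1": "GCTCTCCCCGTGGAGGTAAT",      # SEQ ID NO:1
    "Muc2": "TTCCGTCACAGGTATTTATTCGGT",  # SEQ ID NO:2
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 3), reverse_complement(seq))
```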

We also constructed a transposon with a bacterial replicon between the Mu ends to allow
easier outcloning. The p15A replicon was cut from the plasmid pACYC184 (Rose 1988)
with SphI, blunted with Klenow enzyme, and ligated into EcoRI-cut end-filled pHTH1 to
produce kanMX4-p15A-Mu plasmid, pHTH4.

Construction of Mu/LoxP-Kan/Neo transposon

A neomycin-resistance cassette containing a bacterial promoter, SV40 origin of replication,
SV40 early promoter, kanamycin/neomycin resistance gene, and Herpes simplex virus
thymidine kinase polyadenylation signals was generated by PCR from the pIRES2-EGFP
plasmid (Clontech). After addition of LoxP sites and Mu end sequences using standard
PCR-based techniques, the construct was cloned as a BglII fragment into a vector
backbone derived from pUC19. The construct (pALH28) was confirmed by DNA
sequencing.


Assembly and concentration of transpososomes

The transposons (kanMX4-Mu, 1.5 kb; kanMX4-p15A-Mu, 2.3 kb; Mu/LoxP-Kan/Neo,
2.1 kb) were isolated by BglII digestion from their respective carrier plasmids (pHTH1,
pHTH4, pALH28). The DNA fragments were purified chromatographically as described
(Haapa et al. 1999a).


The standard in vitro transpososome assembly reaction (20 µl or 80 µl) contained 55 nM
transposon DNA fragment, 245 nM MuA, 150 mM Tris-HCl pH 6.0, 50 % (v/v) glycerol,
0.025 % (w/v) Triton X-100, 150 mM NaCl, and 0.1 mM EDTA. The reaction was carried out
at 30°C for 2 h. The complexes from several reactions were concentrated and desalted with a
Centricon concentrator (Amicon) according to the manufacturer's instructions and washed
once with water. The final concentration factor was approximately tenfold for the yeast
transformations and about 20-fold for the mouse transfections.
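
As a rough check of the reaction stoichiometry, the concentrations given above can be converted into a molar ratio and a DNA mass. The following is only an illustrative sketch: the average mass of 650 g/mol per base pair is a common approximation and is not a value taken from this description, and the function names are hypothetical.

```python
# Illustrative arithmetic for the standard assembly reaction described above.
# Assumption (not from the text): double-stranded DNA averages about
# 650 g/mol per base pair.
AVG_BP_MASS = 650.0  # g/mol per bp, approximate


def mua_per_transposon(mua_nM, dna_nM):
    """Molar ratio of MuA protein to transposon DNA fragment."""
    return mua_nM / dna_nM


def transposon_mass_ug(dna_nM, volume_ul, length_bp):
    """Mass of transposon DNA (micrograms) in one reaction."""
    moles = dna_nM * 1e-9 * volume_ul * 1e-6      # mol = (mol/l) * l
    grams = moles * length_bp * AVG_BP_MASS
    return grams * 1e6


ratio = mua_per_transposon(245, 55)               # MuA per DNA fragment
per_end = ratio / 2                               # each fragment has two Mu ends
mass = transposon_mass_ug(55, 20, 1500)           # 1.5 kb kanMX4-Mu, 20 ul reaction

print(f"MuA per transposon: {ratio:.2f}")
print(f"MuA per Mu end:     {per_end:.2f}")
print(f"DNA per 20 ul:      {mass:.2f} ug")
```

Under these assumptions, a 20 µl reaction with the 1.5 kb kanMX4-Mu fragment holds roughly 1 µg of transposon DNA, with about 4.5 MuA molecules per fragment.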

Electrocompetent bacterial and yeast cells 

Electrocompetent bacterial cells for standard cloning were prepared and used as described
(Lamberg et al., 2002). Electrocompetent S. cerevisiae cells were grown as follows. An
overnight stationary phase culture was diluted 1:10 000 in fresh YPD (1 % yeast extract, 2
% peptone, 2 % glucose) and grown to A600 0.7-1.2. The cell pellets were collected by
centrifugation (5000 rpm), suspended in volume of 0.1 M lithium acetate, 10 mM
dithiothreitol, 10 mM Tris-HCl pH 7.5, 1 mM EDTA (LiAc/DTT/TE) and incubated at
room temperature for 1 h. The repelleted cells were washed with ice-cold water and again
collected by centrifugation. The pellet was then resuspended in 1/10 of the original volume
of ice-cold 1 M sorbitol. Following centrifugation, the pellet was suspended in ice-cold 1
M sorbitol to yield ~200-fold concentration of the original culture density. One hundred
microliters of cell suspension were used for each electroporation. For competence status
determination, transpososomes or plasmid DNA were added to the cell suspension and
incubated on ice for 5 min. The mixture was transferred to a 0.2 cm cuvette and pulsed at
1.5 kV (diploid FY1679) or 2.0 kV (haploid FY-3), 25 µF, 200 ohms with a Bio-Rad
Genepulser II. After electroporation 1 ml of YPD was added, and cultures were incubated
at 30°C for 0-4 hours. Subsequently cells were plated on YPD plates containing 200 µg/ml
of G418. The competent status of the yeast strains was evaluated in parallel by
electroporation of a control plasmid pYC2/CT (URA3, CEN6/ARSH4, ampR, pUC ori,
Invitrogen) and plating the cells on minimal plates.
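
The pulse settings given above can be expressed as field strength and a nominal time constant, which makes them easier to compare between cuvette sizes. This is a minimal sketch with hypothetical function names. The time constant computed here is simply R x C for the stated resistor and capacitor; treating that product as the pulse time constant is an assumption, since the actual value also depends on the resistance of the sample.

```python
# Field strength and nominal RC time constant for the yeast electroporation
# settings described above. Function names are illustrative only.

def field_strength_kv_per_cm(voltage_kv, gap_cm):
    """Electric field across the cuvette gap, in kV/cm."""
    return voltage_kv / gap_cm


def nominal_time_constant_ms(resistance_ohm, capacitance_uF):
    """R x C in milliseconds (assumes the set resistor dominates)."""
    return resistance_ohm * capacitance_uF * 1e-6 * 1e3


diploid = field_strength_kv_per_cm(1.5, 0.2)    # FY1679, 0.2 cm cuvette
haploid = field_strength_kv_per_cm(2.0, 0.2)    # FY-3, 0.2 cm cuvette
tau = nominal_time_constant_ms(200, 25)         # 200 ohms, 25 uF

print(f"FY1679: {diploid:.1f} kV/cm, FY-3: {haploid:.1f} kV/cm, RC = {tau:.1f} ms")
```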


Mouse ES cell transfection and colony isolation 

The procedures used for electroporation of mouse AB2.2-Prime embryonic stem (ES) cells
have been described by Sands and Hasty (1997). Briefly, the AB2.2-Prime ES cells were
collected in phosphate-buffered saline (PBS) at a density of 11 x 10^ cells/ml. 2.2-2.3 µg of
the transposon complexes or linearized DNA was added to a 0.4 cm electroporation
cuvette. For each electroporation, 0.9 ml of ES cell suspension (approximately 10 x 10^
cells) was mixed with transpososomes or linear DNA. Electroporation was carried out
using Bio-Rad's Gene Pulser and Capacitance Extender at 250 V, 500 µF. After
electroporation the cells stood at RT for 10 min and were then plated on gelatin-coated
plates. The electroporated ES cells were cultured in the conditions mentioned above for
24-48 hours before adding G418 (Gibco) to a final concentration of 150 µg/ml to select
transposon insertions. Transfected colonies of ES cells were picked after 10 days in
selection and individual colonies were cultured in separate wells of 96-well or 24-well
plates using the conditions described above.


Isolation of genomic DNA 

Yeast. Genomic DNA of each geneticin-resistant yeast clone was isolated either with the
QIAGEN Genomic DNA Isolation kit or according to Sherman et al., 1981.

Mouse ES cells. Genomic DNA was isolated from ES cells essentially according to the
method developed by Miller et al. (1988). ES cells were collected from individual wells
of the 24-well cultures and suspended in 500 µl of the proteinase K digestion buffer (10
mM Tris-HCl (pH 8.0), 400 mM NaCl, 10 mM EDTA, 0.5 % SDS, and 200 µg/ml
proteinase K). The proteinase K treatment was carried out for 8-16 hours at 55°C.
Following the proteinase K treatment, 150 µl of 6 M NaCl was added, followed by
centrifugation in a microcentrifuge (30 min, 13 K). The supernatant was collected and
precipitated with ethanol to yield a DNA pellet that was washed with 70% ethanol and
air-dried. DNA was dissolved in TE buffer (10 mM Tris-HCl, pH 8.0 and 1 mM EDTA).

Southern blot

Yeast. The DNA was digested with appropriate enzymes. The fragments were
electrophoresed on a 0.8 % agarose gel and blotted onto Hybond N+ membrane
(Amersham). Southern hybridisation was carried out with [α-32P]dCTP-labelled (Random
Primed, Roche) kanMX4 (BglII-EcoRI fragment) as a probe.

Mouse ES cells. DNA Southern blot hybridisation was performed according to standard
methods as described (Sambrook et al., 1989). 10-15 µg of the wild type and transfected
AB2.2-Prime ES cell DNAs were digested with various restriction enzymes and separated
on 0.8% agarose gels. The DNA was transferred to a nylon filter (Hybond N+, Amersham)
and fixed with UV (Stratalinker, Stratagene). Inserted DNA was visualized by hybridisation
with an [α-32P]dCTP-labeled (Rediprime II, Amersham) DNA probe (Mu/LoxP-Kan/Neo
BamHI fragment). Hybridisation was performed at 65°C for 16 hours in a solution
containing 1.5 x SSPE, 10% PEG 6000, 7% SDS, and 100 µg/ml denatured herring sperm
DNA. After the hybridisation, the filter was washed twice for 5 min and once for 15 min in
2 x SSC, 0.5% SDS at 65°C and once or twice for 10-15 min in 0.1 x SSC, 0.1% SDS at
65°C. The filter was exposed to a Fuji phosphoimager screen for 8-16 hours and processed
in a FujiBAS phosphoimager.

Determination of target site duplication

Cloning. Yeast genomic DNA was digested with BamHI + BglII, SalI + XhoI or PvuII to
produce a fragment with a transposon attached to its chromosomal DNA flanks. These
fragments were then cloned into pUC19 cleaved with BamHI, SalI or SmaI, respectively,
selecting for kanamycin and ampicillin resistance. Alternatively, clones transfected with
kanMX4-p15A were cleaved with BamHI + BglII, ligated, electroporated and selected for
resistance produced by the transposon-containing fragments. DNA sequences of transposon
borders were determined from these plasmids using the transposon-specific primers Seq A
and Seq MX4. Genomic locations were identified using the BLAST search at the SGD
(Saccharomyces Genome Database; http://genome-www.stanford.edu/Saccharomyces/) or
SDSC Biology WorkBench (http://workbench.sdsc.edu/) servers.



The primers for sequencing the ends of cloned yeast inserts were Seq A:
5'-ATCAGCGGCCGCGATCC-3' (SEQ ID NO:3) and Seq MX4:
5'-GGACGAGGCAAGCTAAACAG-3' (SEQ ID NO:4).

PCR amplification. Two micrograms of yeast genomic DNA was digested with BamHI + BglII
or NheI + SpeI. Specific partially double-stranded adapters were made by annealing
2 µM adapter primer 1 (WAP-1) with complementary 2 µM adapter primer 2 (WAP-2*), 3
(WAP-3*), or 4 (WAP-4*). The 3' OH group of the WAP-2*, WAP-3*, and WAP-4*
primers was blocked by a primary amine group and the 5' ends were phosphorylated. The
restriction fragments (200 ng) generated by BamHI + BglII were ligated with 22 ng of
adapter that was made by annealing primers WAP-1 and WAP-2*, whereas the restriction
fragments generated with NheI + SpeI were ligated with 22 ng of adapter made by
annealing primers WAP-1 and WAP-3*. One fifth of the ligation reaction was used as a
template to perform PCR amplification in 20 µl to enrich for DNA fragments between the
adapter and the transposon with primers Walker-1 and TEFterm-1 or Walker-1 and
TEFprom-1. PCR conditions were 94 °C, 1 min; 55 °C, 1 min; 72 °C, 4 min for 30 cycles.
Nested PCR was carried out in 50 µl using 2 µl of one hundred-fold diluted primary PCR
products as a template, with primers Walker-2 and TEFterm-2 or Walker-2 and TEFprom-2
for PCR products produced from BamHI + BglII fragments and Walker-3 and TEFterm-2
or Walker-3 and TEFprom-2 for PCR products produced from the NheI + SpeI fragments.
The PCR conditions were as before. The amplified nested PCR products were sequenced
using sequencing primer Mu-2.

One microgram of mouse genomic DNA was digested with BglII + BclI or NheI + SpeI.
Specific partially double-stranded adapters were made as for the yeast. The restriction
fragments (400 ng) generated by BclI + BglII were ligated with 44 ng of adapter that was
made by annealing primers WAP-1 and WAP-2*, whereas the restriction fragments (200
ng) generated with NheI + SpeI were ligated with 22 ng of adapter made by annealing
primers WAP-1 and WAP-3*. One fourth or one fifth of the ligation reaction, respectively,
was used as a template to perform PCR amplification in 20 µl to enrich for DNA fragments
between the adapter and the transposon with primers Walker-1 and HSP430 or Walker-1
and HSP431. PCR conditions were 94 °C, 1 min; 55 °C, 1 min; 72 °C, 4 min for 30 cycles.
Nested PCR was carried out in 50 µl using 2 µl of eighty-fold or one hundred-fold diluted
primary PCR products as a template, with primers Walker-2 and HSP429 or Walker-2 and
HSP432 for PCR products produced from BclI + BglII fragments and Walker-3 and
HSP429 or Walker-3 and HSP432 for PCR products produced from the NheI + SpeI
fragments. The PCR conditions were as before. The amplified nested PCR products were
sequenced using sequencing primer Mu-2.

Primers for PCR-based detection:

WAP-1       CTAATACCACTCACATAGGGCGGCCGCCCGGGC (SEQ ID NO:5)
WAP-2*      GATCGCCCGGGCG-NH2 (SEQ ID NO:6)
WAP-3*      CTAGGCCCGGGCG-NH2 (SEQ ID NO:7)
WAP-4*      AATTGCCCGGGCG-NH2 (SEQ ID NO:8)
Walker-1    CTAATACCACTCACATAGGG (SEQ ID NO:9)
Walker-2    GGGCGGCCGCCCGGGCGATC (SEQ ID NO:10)
Walker-3    GGGCGGCCGCCCGGGCCTAG (SEQ ID NO:11)
Walker-4    GGGCGGCCGCCCGGGCAATT (SEQ ID NO:12)
TEFterm-1   CTGTCGATTCGATACTAACG (SEQ ID NO:13)
TEFterm-2   CTCTAGATGATCAGCGGCCGCGATCCG (SEQ ID NO:14)
TEFprom-1   TGTCAAGGAGGGTATTCTGG (SEQ ID NO:15)
TEFprom-2   GGTGACCCGGCGGGGACGAGGC (SEQ ID NO:16)
Mu-2        GATCCGTTTTCGCATTTATCGTG (SEQ ID NO:17)
HSP429      GGCCGCATCGATAAGCTTGGGCTGCAGG (SEQ ID NO:18)
HSP430      ACATTGGGTGGAAACATTCC (SEQ ID NO:19)
HSP431      CCAAGTTCGGGTGAAGGC (SEQ ID NO:20)
HSP432      CCCCGGGCGAGTCTAGGGCCGC (SEQ ID NO:21)
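
The relationship between the adapter strands and the Walker primers listed above can be checked directly from the sequences. The sketch below is illustrative only and its function names are hypothetical. It anneals each short strand (WAP-2*, WAP-3*, WAP-4*) to the 3' end of WAP-1, reports the unpaired 5' overhang, and confirms that each nested Walker primer equals the 3' part of WAP-1 followed by that overhang.

```python
# Sequences (5'->3') copied from the primer list above.
WAP1 = "CTAATACCACTCACATAGGGCGGCCGCCCGGGC"
WALKER1 = "CTAATACCACTCACATAGGG"
SHORT_STRANDS = {
    "WAP-2*": "GATCGCCCGGGCG",
    "WAP-3*": "CTAGGCCCGGGCG",
    "WAP-4*": "AATTGCCCGGGCG",
}
NESTED_PRIMERS = {
    "WAP-2*": "GGGCGGCCGCCCGGGCGATC",  # Walker-2
    "WAP-3*": "GGGCGGCCGCCCGGGCCTAG",  # Walker-3
    "WAP-4*": "GGGCGGCCGCCCGGGCAATT",  # Walker-4
}

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def five_prime_overhang(long_strand, short_strand):
    """Unpaired 5' part of short_strand when the remainder pairs with the
    3' end of long_strand (longest pairing wins); None if nothing pairs."""
    for n in range(len(short_strand)):
        if long_strand.endswith(revcomp(short_strand[n:])):
            return short_strand[:n]
    return None


for name, short in SHORT_STRANDS.items():
    overhang = five_prime_overhang(WAP1, short)
    nested = NESTED_PRIMERS[name]
    core = nested[: len(nested) - len(overhang)]
    print(name, "overhang:", overhang,
          "| nested primer = WAP-1 3' end + overhang:",
          WAP1.endswith(core) and nested.endswith(overhang))

print("Walker-1 is the 5' end of WAP-1:", WAP1.startswith(WALKER1))
```

The derived overhangs are GATC, CTAG and AATT, which are the cohesive ends left by the BamHI/BglII/BclI and NheI/SpeI digests used above (AATT would correspond to an EcoRI-type end). Primer names are used as dictionary keys purely for bookkeeping.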

RESULTS

Transposon construction and its introduction to the cells 
To study whether the Mu transposition system also works in eukaryotes (Figure 1), we
constructed a kanMX4-Mu transposon containing the kanR gene from Tn903 and
translational control sequences of the TEF gene of Ashbya gossypii between the Mu ends,
with or without an additional bacterial p15A replicon between the Mu ends (Figure 2A). We
studied the assembly of Mu transpososomes by incubating MuA protein with the kanMX4-Mu
transposon and detected stable protein-DNA complexes by agarose gel electrophoresis
(Figure 3). The reactions with the kanMX4-Mu and kanMX4-p15A-Mu transposons produced
several bands of protein-DNA complexes that disappeared when the sample was loaded in
the presence of SDS, indicating that only non-covalent protein-DNA interactions were
involved in the complexes. Aliquots of assembly reactions with and without MuA
transposase were electroporated into Saccharomyces cerevisiae cells and the yeasts were
scored for geneticin resistance. The competent status of the yeast strains was evaluated in
parallel by electroporation of a control plasmid pYC2/CT. The electroporation efficiency
with the transpososomes into the yeast was approximately three orders of magnitude lower
than the efficiency with the plasmid (Table 1). This result is consistent with previous
results with bacteria (Lamberg et al. 2002). Only the sample containing detectable
protein-DNA complexes yielded geneticin-resistant colonies.

For mouse experiments we constructed a Mu/LoxP-Kan/Neo transposon that contained
bacterial and eukaryotic promoters, a kanamycin/neomycin resistance gene, and Herpes
simplex virus thymidine kinase polyadenylation signals (Figure 2B). The transfection of
the mouse ES cells with the transpososome resulted in 1720 G418-resistant colonies per µg
DNA and the linear control in 330 resistant colonies per µg DNA. Thus the transfection
with the transpososome yielded over 5 times more resistant colonies per µg DNA. The
control cells with no added DNA did not produce any resistant colonies.


Integration of the transposon into the genome 

Southern blot analysis can be used to study whether the transposon DNA was inserted into
the genomic DNA of the geneticin-resistant colonies. Digestion of genomic DNA with
enzyme(s) which do not cut the transposon produces one fragment hybridising to the
transposon probe, and digestion with an enzyme which cuts the transposon once produces
two fragments in the case of genuine Mu transposition. Genomic DNA from 17 kanMX4-Mu
transposon integration yeast clones was isolated, digested with BamHI + BglII, which do
not cut the transposon sequence, or with HindIII, which cleaves the transposon sequence
once, and analysed by Southern hybridisation with the kanMX4 fragment as the probe. Fifteen
isolates generated a single band with a discrete but different gel mobility after BamHI +
BglII digestion (Figure 4A) and two bands after HindIII digestion (Figure 4B). Control
DNA from the recipient strain FY1679 did not generate detectable bands in the analyses.
Two isolates (G5 and G14) gave several hybridising fragments after BamHI + BglII
digestion, suggesting the possibility of multiple transposon integrations. However, these two
isolates gave three fragments after HindIII digestion, rather than twice the number of
fragments detected in the BamHI + BglII digestion, as would be expected for multiple
transposon integrations. The sizes of the HindIII fragments of the isolates G5 and G14 (4.3,
2.4 and 1.3 kb) and the pattern of bands in the BamHI + BglII digestion suggested that the
transposon was integrated into the yeast 2µ plasmid (for confirmation of this see sequencing
results below). Genomic DNA from 17 G418-resistant isolates of the haploid strain FY-3 was
analysed in a similar way after XhoI + SalI digestion (which do not cut the transposon) and
PstI digestion (one cut in the transposon). Thirteen isolates gave one band after XhoI + SalI
digestion and two bands after PstI digestion, suggesting a single integration. Four isolates
gave a pattern of bands similar to that of isolates G5 and G14 of strain FY1679, suggesting
integration into the 2µ plasmid (results not shown). These data indicate that in most of the
studied clones the transposon DNA was integrated as a single copy into the yeast
chromosome. In the rest of the clones a single integration was detected in an episome.
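
The band-counting logic used in the Southern analysis above can be written out as a small rule. This is a simplified sketch with hypothetical function names. It assumes that each insertion lies at a distinct chromosomal locus, that the probe hybridises to transposon sequence on both sides of every cut site within the transposon, and that all fragments are resolved on the gel; it does not model episomal targets such as the 2µ plasmid.

```python
# Expected number of hybridising fragments in a Southern blot, under the
# simplifying assumptions stated in the lead-in.

def expected_bands(n_insertions, cuts_in_transposon):
    """Each insertion yields one fragment per uncut transposon, plus one
    additional fragment for every cut inside the transposon."""
    return n_insertions * (cuts_in_transposon + 1)


def consistent_with_single_insertion(bands_no_cut, bands_one_cut):
    """True if the observed band counts match one chromosomal insertion:
    one band with a non-cutting digest, two with a single-cutter."""
    return (bands_no_cut == expected_bands(1, 0)
            and bands_one_cut == expected_bands(1, 1))


print(expected_bands(1, 0), expected_bands(1, 1))   # single insertion
print(expected_bands(2, 0), expected_bands(2, 1))   # two insertions
print(consistent_with_single_insertion(1, 2))
print(consistent_with_single_insertion(2, 3))
```

A pattern of several bands with the non-cutting digest but only three with the single-cutter, as seen for isolates G5 and G14, fits neither a single nor a double chromosomal insertion under this rule, which is why another explanation was sought.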

Seven mouse ES cell clones were analysed by Southern blotting. Their chromosomal DNA
was digested with BamHI, which releases almost the entire transposon from the genome. All
the clones studied had a band at the same position as the BamHI-digested pALH28 used as
a control. The intensity of the band was similar for all clones studied and for control DNA
representing the same molar amount of DNA as the genomic samples. This suggests that only
one copy of the transposon was integrated into each genome.



The location of insertions in the chromosomes 

Yeast. Mu transposons integrate almost randomly into the target DNA (Haapa-Paananen et
al., 2002). To test the location and distribution of the transposon insertions we cloned
transposon-genomic DNA borders from more than one hundred yeast transformants and
sequenced the insertion sites on both sides of the transposon using transposon-specific
primers (Seq A + Seq MX4). Exact mapping of the insertion sites was possible by BLAST
comparison with the SGD database. We used the strain FY1679, which was used in the
yeast whole genome sequencing (Winston et al. 1995), to ensure correct mapping. The
overall distribution of 140 integrations on the 16 chromosomes of the yeast is shown in
Figure 5A. All chromosomes were hit at least once. Both ORFs and intergenic regions had
transposon integrations (Table 2). A list of integrations into the genome is presented in
Table 3. In the haploid genome, integrations into essential genes were naturally missed due
to the inviability of the cells. On chromosome XII there seems to be a real "hotspot" for
transposon integration, but this is an artefact since the "hotspot" is in the approximately 9
kb region encoding ribosomal RNA (Figure 5B). This locus is repeated tandemly 100-200
times in chromosome XII. In this region, the integrations are distributed randomly. The
chromosomes in Figure 5A are drawn according to SGD, which shows only two copies of
this repeated region (when the systematic sequencing of the yeast genome was done, only
two rDNA repeats were sequenced) instead of the 100 to 200 copies, amounting to 1 to 2 Mb
of DNA, that are actually present in a yeast genome. Only nine integrations were found at a
distance of less than 1 kb from a tRNA gene, which shows that Mu transposon integration
differs from that of Ty1-Ty4 elements. The integration closest to the end of a chromosome
was at 6.3 kb, showing the difference to the telomere-preferring Ty5 element. The mean
interval distance of insertions was 135 kb and was nowhere near covering the whole genome
as a library. However, the distribution was even enough to show the randomness of the
integration.
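
The size of the rDNA array referred to above follows from the repeat length and copy number given in the text. The short sketch below is illustrative only and uses hypothetical names; it also shows by what factor a map drawn with two copies under-represents the array.

```python
# Size of the tandem rDNA array on chromosome XII, using the figures above:
# a repeat of approximately 9 kb present in 100-200 copies.
REPEAT_KB = 9
COPY_RANGE = (100, 200)
COPIES_IN_MAP = 2   # copies shown in the SGD-based chromosome drawing


def array_size_mb(repeat_kb, copies):
    """Total length of a tandem array in megabases."""
    return repeat_kb * copies / 1000.0


low, high = (array_size_mb(REPEAT_KB, c) for c in COPY_RANGE)
under_low, under_high = (c / COPIES_IN_MAP for c in COPY_RANGE)

print(f"rDNA array: {low:.1f}-{high:.1f} Mb")
print(f"Under-represented in the map by a factor of {under_low:.0f}-{under_high:.0f}")
```

This is consistent with the statement that the repeats amount to roughly 1 to 2 Mb, and it explains why insertions into a region drawn as two copies appear clustered.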

Mouse. The sequenced transposon-genomic DNA borders were compared to the Mouse
Genome Assembly v 3 using the Ensembl Mouse Genome Server. The clone RGC57
contained an integrated transposon in chromosome 3, duplicating positions 59433906-10,
which are located in the last intron of both ENSMUSESTG00000010433 and
10426. Sequencing showed the presence of this 5-bp sequence (target site duplication) on
both sides of the integrated transposon.


Integration of the transposon in the yeast 2µ plasmid

Most S. cerevisiae strains contain an endogenous 2µ plasmid. The yeast 2µ plasmid is a
6318 bp circular species present extrachromosomally in S. cerevisiae at 60-100 copies per
cell. The plasmid molecules are resident in the nucleus as minichromosomes with standard
nucleosome phasing (Livingston and Hahne 1979; Nelson and Fangman 1979; Taketo et
al., 1980).

In 23 clones out of 131 (17.6 %) the transposon had integrated into the 2µ plasmid
and in 108 clones (82.4 %) the transposon had integrated into the chromosomes in the
diploid strain FY1679. In the haploid strain FY-3, four clones out of 49 (8.2 %) had
the transposon in the 2µ plasmid and 45 clones (91.8 %) had the transposon in the
chromosomes.

Transposon target site

Genuine Mu transposition produces a 5-bp target site duplication flanking the integrated
transposon (Haapa et al. 1999b). The transposon was flanked by a target site duplication in
121 clones (out of 122; 99.2 %) of the strain FY1679 and in 42 clones (out of 46; 91.3 %)
of the haploid strain FY-3, confirming that a majority of integrations were generated by
DNA transposition chemistry. A consensus sequence of the 5 bp duplication
(5'-N-Y-G/C-R-N-3') has been observed in both in vivo and in vitro transposition reactions,
the most preferred pentamers being 5'-C-Y-G/C-R-G-3' (Mizuuchi and Mizuuchi 1993;
Haapa-Paananen et al. 2002; Butterfield et al. 2002). In this study, the distribution of
nucleotides in duplicated pentamers agreed well with the previous results (Table 4).
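
The target site criteria described above lend themselves to a simple check. The following sketch is illustrative only, with hypothetical function names. It tests whether the last five bases of one flank are repeated as the first five bases of the other flank, and whether a pentamer matches the consensus 5'-N-Y-G/C-R-N-3' (Y = C or T, R = A or G) or the most preferred form 5'-C-Y-G/C-R-G-3'. The example flanks are the first entry of Table 3A.

```python
import re

# Consensus patterns from the text: N = any base, Y = C/T, R = A/G.
CONSENSUS = re.compile(r"^[ACGT][CT][GC][AG][ACGT]$")
PREFERRED = re.compile(r"^C[CT][GC][AG]G$")


def has_target_site_duplication(left_flank, right_flank, length=5):
    """True if the last `length` bases of the left flank equal the first
    `length` bases of the right flank (case-insensitive)."""
    return left_flank[-length:].upper() == right_flank[:length].upper()


def matches_consensus(pentamer):
    return bool(CONSENSUS.match(pentamer.upper()))


def is_preferred(pentamer):
    return bool(PREFERRED.match(pentamer.upper()))


def percent(part, total):
    """Percentage rounded to one decimal, as reported in the text."""
    return round(100.0 * part / total, 1)


left, right = "caacatctagCTCAG", "CTCAGtgagttccga"   # clone G1, Table 3A
print(has_target_site_duplication(left, right))
print(matches_consensus("CTCAG"), is_preferred("CTCAG"))
print(matches_consensus("GGCAT"))                    # G at position 2 is not Y
print(percent(121, 122), percent(42, 46))
```

Such a check only classifies sequences; whether a mismatch reflects a genuine non-canonical integration or a sequencing error has to be decided from the underlying reads.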

Table 1. Number of geneticin-resistant colonies detected following electroporation of
transpososomes into yeast strains, cfu/µg DNA

DNA                       FY1679       FY-3

KanMX-Mu + MuA            351          178
KanMX-Mu - MuA            0            1
KanMX-p15A-Mu + MuA       43           61
KanMX-p15A-Mu - MuA       0            0
Plasmid pYC2/CT*          6.9 x 10^    5.6 x 10^

* Electroporation of plasmid pYC2/CT DNA served as a control for competent status.



Table 2. Distribution of transposon integrations in FY1679 (diploid) and
FY-3 (haploid) strains.

Integration site            FY1679           FY-3    Total

Chromosomal DNA
  Protein coding region                              53
    Essential gene          12 (1 intron)    0
    Nonessential gene       29               11
  rRNA                      12               7       19
  tRNA (intron)             1                0       1
  Ty                        2                0       2
  Solo-LTR                  1                2       3
  Intergenic region         48               23      71
2µ plasmid
  Protein coding region     4                2       6
  Intergenic region         12               2       14

Total                       121              47      169

Table 3A. Transposon integration sites and target site duplications in
Saccharomyces cerevisiae diploid strain FY1679.

Clone  ←seqmx4 (transposon) seqA→                 Location*

G1     caacatctagCTCAG(KanMX4-Mu)CTCAGtgagttccga  chr13:908424-908428
G2     agtactaccaTTGAA(KanMX4-Mu)TTGAAtttacgttca  chr9:279340-279344
G3     taaaaattcaGGCAT(KanMX4-Mu)GGCATatacaattat  chr15:569334-568338
G4     taaaccaccaTCTGT(KanMX4-Mu)TCTGTcgcccatctt  chr12:239388-239392
G5     ctgattactaGCGAA(KanMX4-Mu)GCGAAgctgcgggtg  2µ:3447-3451 (NC_001398)
G6     aagaaaagctCAGTG(KanMX4-Mu)CAGTGgaataatttt  chr4:825525-825529
G7     gaactctttcCCCAC(KanMX4-Mu)CCCACcgatccattg  chr16:862127-862131
G8     aaagatgaaaCCGAG(KanMX4-Mu)CCGAGtaagctgcta  chr3:263950-263954
G9     caatgcatcaTCTAC(KanMX4-Mu)TCTACattacaaacc  chr2:766314-766318
G10    tttgttcacgCGGGC(KanMX4-Mu)CGGGCcgcagttgtg  chr11:308515-308519
G11    atctgtattaACTTC(KanMX4-Mu)ACTTCgaggtagtaa  chr7:854983-854987
G12    ttttcatgttCCTAT(KanMX4-Mu)CCTATtcttgttctt  chr5:327111-327115
G13    tatccacttcTTAGA(KanMX4-Mu)TTAGAgggactatcg  chr12:456350-456354
G14    aaactgttttACAGA(KanMX4-Mu)ACAGAtttacgatcg  2µ:2720-2724
G15    tggagttaggCTGGC(KanMX4-Mu)CTGGCtcggactggc  chr10:702930-702934
G16    gagcttctgcTTCAC(KanMX4-Mu)TTCACgttttttgga  chr7:568606-568610
G17    taacgctagaGGGGC(KanMX4-Mu)GGGGCaagaaggaag  chr1:136875-136879
G18    tccaaccgtaGTGGT(KanMX4-Mu)GTGGTtatataataa  chr10:241383-241387
G19    gggggcaatgGTGAA(KanMX4-Mu)GTGAAatttcgacgc  chr4:276367-276371
G20    taagagcttgTCCGC(KanMX4-Mu)TCCGCttcgccccaa  chr13:904363-904367
G21    cataagtgtaAGCCA(KanMX4-Mu)AGCCAtatgttccct  chr9:249583-249587
G22    tctggcttaaACCAG(KanMX4-Mu)ACCAGcactatgtat  chr4:544898-544902
G23    gttgaatcttCCGAT(KanMX4-Mu)CCGATaccatcgaca  chr12:65144-65148
G34    ccctagcgccTAGGG(KanMX4-Mu)TAGGGtcgagtactg  chr9:138283-138287
G36    ttgctttaacTAGGA(KanMX4-Mu)TAGGAaagaataaga  chr15:892270-892274
G37    agagactgaaGACGA(KanMX4-Mu)GACGAggaaatcaaa  chr16:67656-69660
G38    atggatggcgCTCAA(KanMX4-Mu)CTCAAgcgtgttacc  chr12:453865-453869
G40    tccatcttctGTGGA(KanMX4-Mu)GTGGAgaagactcga  chr14:661338-661342
G41    ttcactcattCTGGT(KanMX4-Mu)CTGGTcatttcttcg  chr15:720163-720167
G42    ctagcgctttACGGA(KanMX4-Mu)ACGGAagacaatgta  2µ:2838-2842
G43    ggtaataggcCCGTG(KanMX4-Mu)CCGTGcggttccgtc  chr15:836789-836793
G44    gtggtgccctTCCGT(KanMX4-Mu)TCCGTcaattccttt  chr12:456583-456587
G45    ttcgctgctcACCAA(KanMX4-Mu)ACCAAtggaatcgca  chr12:458164-458168
G46    aatattatctTCTGT(KanMX4-Mu)TCTGTcattgttact  chr10:135624-135628
G47    gtatgtacccACCGA(KanMX4-Mu)ACCGAtgtagcagta  chr15:829039-829043
G48    gttgatggtaCCTTG(KanMX4-Mu)CCTTGacaccagcca  chr6:44321-44325
G49    tacattgtctTCCGT(KanMX4-Mu)TCCGTaaagcgctag  2µ:2838-2842
G50    ccgtggaagcCTCGC(KanMX4-Mu)CTCGCccgatgagtt  chr10:526881-526885
G51    tttcttttccTCCGC(KanMX4-Mu)TCCGCttattgatat  chr12:455126-455130
G52    gctgcgtctgACCAA(KanMX4-Mu)ACCAAggccctcact  chr12:453213-453217
G53    tactgttgaaCCGGG(KanMX4-Mu)CCGGGtcgtacaact  chr14:736161-736165
G54    caaatgtatcAGCAG(KanMX4-Mu)AGCAGatgtacttcc  chr14:566860-566864
G55    agtttccgctATAAA(KanMX4-Mu)ATAAAtaatggcagc  chr10:161496-161500
G56    aaaggaattgCTAGG(KanMX4-Mu)CTAGGggcattactc  chr12:912615-912619
G57    aaaaataattACTCT(KanMX4-Mu)ACTCTaacatttctt  chr16:120160-120164
G58    tgtttatatgATGAC(KanMX4-Mu)ATGACgattttccca  chr11:306835-306839
G59    ttgtgtatttTTGAT(KanMX4-Mu)TTGATtgaaaatgat  chr4:600461-600465
G60    tatgataatcAAGGC(KanMX4-Mu)AAGGCataattgact  chr2:429112-429116
G63    cagcattaaaACGGC(KanMX4-Mu)ACGGCagcaaagccc  chr16:826635-826639
G64    ttgacatgtgATCTG(KanMX4-Mu)ATCGTcacagatttt  2µ:5268-5272
G65    tcagctctcaGCAGA(KanMX4-Mu)GCAGAgaaaaaattt  chr2:117272-117276
G66    tgctaggtgtGTCTG(KanMX4-Mu)GTCTGtttatgcatt  chr14:331432-331436
G67    caattgaggtTTGAA(KanMX4-Mu)TTGAAattgctggcc  chr12:455361-455365
G67    aatcatgcatTGCAT(KanMX4-Mu)TGCATaatgtggtat  2µ:2196-2200
G70    acgatcttacGTCGG(KanMX4-Mu)GTCGGctatctcacg  chr3:77666-77670
G71    ttgtatttaaACTGG(KanMX4-Mu)ACTGGagtgatttat  2µ A:5800-5804
G74    tgcatatttgCCTGC(KanMX4-Mu)CCTGCgaaaaaaagt  chr5:436799-436803
G75    tcgttgaataATGGA(KanMX4-Mu)ATGGAaaatatgaaa  chr10:187594-187598

Table 3A (Continued)

G76    ctttcccagaACCAG(KanMX4-Mu)ACCAGggaaactgtt  chr14:537718-537722
G77    cctctgcatcCCAAC(KanMX4-Mu)CCAACaccagcgata  chr4:955105-955109
G78    atctgtaaacTCGCT(KanMX4-Mu)TCGCTtgtgacgatg  chr4:480341-480435
G79    tcctgcctaaACAGG(KanMX4-Mu)ACAGGaagacaaagc  chr14:547141-547145
G80    tagaaaaaacCACAA(KanMX4-Mu)CACAAcaacactatg  chr10:111531-111535
G81    ttttggctcgTCCGG(KanMX4-Mu)TCCGGatgatgcgaa  chr16:641397-641401
G83    tgtggctaccGCCCG(KanMX4-Mu)GCCCGtgattcgggc  chr4:1433822-1433826
G84    ggcatagtgcGTGTT(KanMX4-Mu)GTGTTtatgcttaaa  2µ:541-545
G85    aaaatgcaacGCGAG(KanMX4-Mu)GCGAGagcgctaatt  2µ:3134-3138
G87    gaacagttccACGCC(KanMX4-Mu)ACGCCtgatatgagg  chr11:60765-60769
G88    agcgcgactgCCCGA(KanMX4-Mu)CCCGAagaaggacgc  chr4:1056229-1056233
G90    aaaaggttcaGTAGA(KanMX4-Mu)GTAGAaacataaaat  chr11:430889-430893
G94    ccacaaggacGCCTT(KanMX4-Mu)GCCTTattcgtatcc  chr12:451993-451997
G96    cagaatccatGCTAG(KanMX4-Mu)GCTAGaacgcggtga  chr12:452043-452047
G97    cagctgctacCCAGG(KanMX4-Mu)CCAGGgattgccacg  chr2:415433-415437
G98    ctagccgttcATCAA(KanMX4-Mu)ATCAAtcatgtcaaa  chr4:539356-539360
G99    caaaaaagtcTAGAG(KanMX4-Mu)TAGAGgaaaaaaacg  chr13:406197-406201
G100   ttgtcaaagtACCGA(KanMX4-Mu)ACCGAtcatgacaat  chr5:258808-258812
G101   gtaacatcttGGGCG(KanMX4-Mu)GGGCGtttgcaacac  chr16:135372-135376
G102   actgcctttgCTGAG(KanMX4-Mu)CTGAGctggatcaat  2µ:2524-2528
G103   aatgtaaaagGCAAG(KanMX4-Mu)GCAAGaaaacatgta  chr4:1011940-1011944
G104   gcctgaattgTAGAT(KanMX4-Mu)TAGATattagataag  chr15:770712-770716
G105   gtttgacattGTGAA(KanMX4-Mu)GTGAAgagacataga  chr12:452744-452748
G106   tgtcatctacATCAT(KanMX4-Mu)ATCATcggtattatt  chr4:1160847-1160851
G107   cttgttcctaGTGGC(KanMX4-Mu)GTGGCgctaatggga  chr4:464844-464848
G108   agggccctcaGTGAT(KanMX4-Mu)GTGATggtgttttgt  2µ B:4396-4400
G109   ggtattttcaTTGGT(KanMX4-Mu)TTGGTtgtaaaatcg  chr12:582690-582694
G110   caatctaaccACCAT(KanMX4-Mu)ACCATgttggctcac  chr15:75760-75764
G111   cgaaaaatgcACCGG(KanMX4-Mu)ACCGGccgcgcatta  2µ:5427-5431
G113   ttacgatctgCTGAG(KanMX4-Mu)CTGAGattaagcctt  chr12:451812-451816
G114   aaatcgagcaATCAC(KanMX4-Mu)GTGATtgctcgattt  2µ:2126-2130
G116   ccgacaaaccCCCCC(KanMX4-Mu)CCCCCcatttatata  chr15:1039713-1039717
G117   caataagatgTGGGG(KanMX4-Mu)TGGGGattagtttcg  chr13:895900-895904
G118   gtttaacgctTCCTG(KanMX4-Mu)TCCTGggaactgcag  chr16:30277-30281
G120   atgaatactcCTCCC(KanMX4-Mu)CTCCCttgctgttgg  chr14:175588-175592
G121   aatcacaatgGCGGC(KanMX4-Mu)GCGGCcatcgaccct  chr12:1030933-1030937
G122   gagcaccacgATCGT(KanMX4-Mu)ATCGTtcggtgtact  chr13:67812-67816
G123   aaaagcattcTGCAG(KanMX4-Mu)TGCAGtaattagccg  chr15:638922-638926
G124   gtgattctccATGGG(KanMX4-Mu)ATGGGtggtttcgct  chr14:333823-333827
G125   gctggtccagACCAC(KanMX4-Mu)ACCACaaaaggatgc  chr13:540587-540591
G126   acttcgacttCGGGT(KanMX4-Mu)CGGGTaaaatactct  chr12:328174-328178
G127   tgacattaatCCTAC(KanMX4-Mu)CCTACgtgacttaca  chr5:291453-291457
G128   tttatatccgGTGGT(KanMX4-Mu)GTGGTtgcgataagg  chrS:317469-317473
G129   ctgatgtgcgGTGGT(KanMX4-Mu)GTGGGccttggactt  chrS:336404-336408
G130   gttgaactacTACGG(KanMX4-Mu)TACGGttaagggtgc  chr16:40318-40322
G131   cctatactctACCGT(KanMX4-Mu)ACCGTcagggttgat  chr12:453842-453846
G132   aactagcaaaATGGA(KanMX4-Mu)ATGGAaacaaaaaaa  chr2:692001-692005
G133   ttgactcaacACGGG(KanMX4-Mu)ACGGGgaaactcacc  chr12:456534-456538
G134   cattgtgaccCTGGC(KanMX4-Mu)CTGGCaaatttgcaa  chr12:651930-651934
G135   atacagctcaCTGTT(KanMX4-Mu)CTGTTcacgtcgcac  2µ B:4039-4043
G136   tcagatttttCCCAG(KanMX4-Mu)CCCAGtatggctttg  chr7:976865-976869
G137   tttaacgtggGCGAA(KanMX4-Mu)GCGAAgaagaaggaa  chr11:327312-327316
G138   ccattccataTCTGT(KanMX4-Mu)TCTGTtaagtataca  chr12:460247-460251
G140   ctttgtgcgcTCTAT(KanMX4-Mu)TCTATaatgcagtct  2µ:3318-3322
G150   aattggtacaGTATG(KanMX4-Mu)GTATGctcaaaaata  chr12:492584-492588
T1     ttgtagcttcCACAA(Mu-KanMX4-p15A-Mu)CACAAgatgttggct  chr12:645643-645647
T2     tcttattctcCTGTT(Mu-KanMX4-p15A-Mu)CTGTTgccttcgtac  chrS:7908-7912
T3     cggttgtataTGCAT(Mu-KanMX4-p15A-Mu)TGCATtgtacgtgcg  chrS:402750-402754
T4     ttttaataagGCAAT(Mu-KanMX4-p15A-Mu)GCAATaatattaggt  chr10:538071-538075

Table 3A (Continued)

T5     tatcacttacTCGAA(Mu-KanMX4-p15A-Mu)TCGAAcgttgacatt  chr12:864259-864263
T6     aaagacatctACCGT(Mu-KanMX4-p15A-Mu)ACCGTgaaggtgccg  chr7:999996-1000000
T7     catattactgCCCGC(Mu-KanMX4-p15A-Mu)CCCGCgtaatccaat  chr15:304883-304887
T8     gtgttagtgaATGCC(Mu-KanMX4-p15A-Mu)ATGCCtcaaactctt  chr10:304087-304091

Target site duplication is typed in capital letters.
*Chromosome and the coordinates of the duplicated sequence.
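
Because every location in Table 3A denotes a 5-bp duplicated sequence, the coordinate strings can be sanity-checked mechanically. The sketch below is illustrative only and its function names are hypothetical. It parses a location of the form "target:start-end" and reports the length of the interval, so that entries whose span is not 5 bp can be flagged for comparison with the original sequencing data.

```python
import re

# Location strings look like "chr13:908424-908428" or "2µ B:4039-4043".
# Optional spaces around the separators are tolerated.
_LOCATION = re.compile(r"^\s*([^:]+?)\s*:\s*(\d+)\s*-\s*(\d+)\s*$")


def parse_location(text):
    """Return (target, start, end) from a location string."""
    match = _LOCATION.match(text)
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    target, start, end = match.groups()
    return target, int(start), int(end)


def duplicated_length(text):
    """Inclusive length of the interval named by a location string."""
    _, start, end = parse_location(text)
    return end - start + 1


for loc in ("chr13:908424-908428", "2µ B:4039-4043", "chr4:480341-480435"):
    n = duplicated_length(loc)
    flag = "" if n == 5 else "  (not 5 bp, check against source data)"
    print(f"{loc}: {n} bp{flag}")
```

A span other than 5 bp does not by itself indicate how the entry should read; it only marks the coordinates as needing verification.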

Table 3B. Transposon integration sites and target site duplications in
Saccharomyces cerevisiae haploid strain FY-3.



«~seqmx4 



seqA- 



Location* 





a a a g a ^ a a a cLM. X MALt 


is.anMX4 


-Mu) 




CC U U C U C C U CO 1 viVjij 


KanMX4 


-Mu) 




acccacccciKjtdijC 


,KanMX4 


-Mu) 


no. 


^ a a ^ 4" ^ ^ 1" ^ ^"1*/^ TV 


KanMX4 


-Mu) 


V73 


y a U U C a, C. C a. U V„ Atj 1 


K.anMA4 


-Mu) 




gaattttaaGAGAtc 


[KanMX4 


-Mu) 


an 

VJ / 


gttcgatgctGTGCG 


[KanMX4 


-Mu) 




cttcacggtaACGTA 


:KanMX4 


-Mu) 


no 


caaggagcagAGGGC 


;KanMX4 


-Mu) 


m n 
V7X yj 


tcaataaacaGCCGA 


[KanMX4 


-Mu) 


w X J. 


gcgagatgagGTGAA 


[KanMX4 


-Mu) 




taaatttcatCCGGA 


tKanMX4 


-Mu) 


ox J 


agaaaagt acAATTc 


:KanMX4 


-Mu) 


VjX4 


actgtcttttCCGGT 


;KanMX4 


-Mu) 


oXD 


atacacgctcATCAG 


[KanMX4 


-Mu) 


\sX o 


atagtatttcCTAGT 


(KanMX4 


-Mu) 


VjrX / 


ttcctattctCTAGA 


(KanMX4 


-Mu) 


no n 


ttataaggttGTTTC 


:KanMX4 


-Mu) 




ttcgagagtgCCATT 


:KanMX4 


-Mu) 


KStJ O 


atggatggcgCTCAA 


[KanMX4 


-Mu) 




tccaaatgtaTTGTG 


(KanMX4 


-Mu) 


U 


atgattatttCACGG 


(KanMX4 


-Mu) 




atggaaaactAGCGC 


[KanMX4 


-Mu) 


o/i "a 
o4 J 


gagaatcttgTCTTG 


[KanMX4 


-Mu) 


V»44 


tagcaaacgTAAGTCTtc (KanMX4- 


G45 


ttgccgcgaaGCTAC 


(KanMX4 


-Mu) 


G4 6 


gtagctctttTCCAT 


(KanMX4 


-Mu) 


G47 


atgttcattcTCTGT 


{KanMX4 


-Mu) 


G4 8 


aatcgtaaccATAAA 


(KanMX4 


-Mu) 


G4 9 


ccttcctgctGTGGG 


(KanMX4 


-Mu) 


G50 


tcttagggt t ATTGG 


(KanMX4 


-Mu) 


G51 


agttaacttcCCCGG 


(KanMX4 


-Mu) 


G52 


atgtgtcattGAGGG 


(KanMX4 


-Mu) 


G53 


ggttaacttgCTCGC 


:KanMX4 


-Mu) 


G54 


caaaaaaagaTGGAG 


(KanMX4 


-Mu) 


G55 


gatatttacgCTTAT 


:KanMX4 


-Mu) 


G56 


gccgtggtttCCGGA 


:KanMX4 


-Mu) 


G57 


tttctggaatTAGGG 


(KanMX4 


-Mu) 


G58 


attactttatTTGGC 


(KanMX4 


-Mu) 


G59 


cgttatcataTTGAT 


(KanMX4 


-Mu) 


G60 


ggcaaactatCTCAC 


(KanMX4 


-Mu) 


G61 


ctaatagtgcATGAT 


(KanMX4 


-Mu) 


G62 


agaaattctcCTTGG 


(KanMX4 


-Mu) 


G63 


tcccgcactgGTGAT 


(KanMX4 


-Mu) 


G64 


atcattcattGCCGG 


(KanMX4 


-Mu) 


G65 


ctcacqctctGCGAT 


(KanMX4 


-Mu) 



i) ATAAGaaaatcttct 
i) GTGGGaaccgcttta 
i) (GCTGCttttccttaa) 
i) CTCATttgaccgagg 
0 GCAGTaatactaata 
i) GAtcAAgtcttgtga 
i) GTGCGggacttctac 
i) ACGTAactgaatgtg 
i) AGGGCacaaaacacc 
i) GCCGAcatacatccc 
i) GTGAAaagaaactta 
i) CCGGAagaaaaatga 
i) gATcAaggttacggc 
i) CCGGTcattccaaca 
i) ATCAGacaccacaaa 
i) CTAGTgatctcggcg 
i) CTAGAaagtatagga 
i) gaGTTTCatatgtgttt 
i) CCATTgtaccagact 
0 CTCAAgcgtgttacc 
i) TTGTGagatgaaaat 
i) CACGGatttcattag 
i) AGCGCataattttgt 
i) TCTTGatgtaacaaa 

-Mu) gAAGTCTAAaggttg 
i) GCTACcatccgctgg 
i) TCCATggatggacga 
i) TCTGTagcagtaaga 
i) ATAAAtataagttcc 
i) GTGGGcagagagcga 
i) ATTGGtagggt tttg 
i) CCCGGtgttcagtat 
i) GAGGGaaaatgtaat 
i) CTCGCcatatatatc 
i) TGGAGtacagtacgc 
i) CTTATcaatctctgg 
i) CCGGAgaaagacgaa 
i) TAGGGtgacagaatg 
i) TTGGCtaaagatcct 
i) TTGAtattgcttatt 
i) CTCACcagaggtctg 
i) ATGATtatatatcaa 
i) CTTGGgattagataa 
0 GTGATacctacaccc 
i) GCCGGaaaaagaaag 
i) GCGATtaacaqctca 



Chr3 :38982-38986 
2\x: A:4372-4376 
2p:5349-5353 
Chrl6 : 837554-83 7558 
chr4: 3069-3073 
Chris : 144910-144915 
Chrl: 191076-191080 
Chrl2: 453541-453545 
Chrl2 ;454727-424731 
2u: 5123-5127 
chr7 : 284048-284052 
Chrll: 489457-4 89461 
Chr4: 56735-56740 
chrll: 428648-428652 
chrl2: 4 53989-4 53993 
Chris : 989676-989680 
2p:704-708 
Chris : 854340-854344 
chr8 :4891S5-489159 
chrl2 : 4 53865-4 53869 
Chris : 83488 8-834 892 
chrl3:97657-97661 
Chr4 :437081-437085 
chr7 :190765-190769 
chrl2 : 459205-459213 
chrl2 : 4 52091-4 52 095 
Chrl2 : 6454 93-6454 97 
chrl0:337762-337766 ' 
chr2 :806825-806829 
chr7 : 73 9278-73 9278 
Chr9 : 3 82384-3 82 388 
Chrl2 : 1025073-1025077 
Chr7 : 798084-798088 
Chr2 :657457-657461 
Chr2 :466108-466112 
chr2 :80588-80592 
chrl3 : 347229-34 7233 
Chr4 :722468-722472 
Chr4:600407-600411 
chrl5: 696010-696013 
chrlO: 117057-117061 
Chr7 : 8 53604-8 53608 
Chr5 :137549-137553 
Chrl2: 213298-213302 
Chrl2: 370966-370970 
chrlO: 404834-404838 



-y-- ^«jt--*v-«.L-j.v_.i.* in udpxtax xecceri 

♦Chromosome and the coordinates of the duplicated sequence. 

Table 4. Nucleotide consensus of the sequenced duplicated pentamers.
Values are numbers of sites, with percentages in parentheses.

FY1679 (n=121):
Nucleotide  1        2        3        4        5
A           34 (28)  10 (8)   13 (11)  47 (39)  27 (22)
C           31 (26)  58 (48)  45 (37)  8 (7)    27 (22)
G           28 (23)  11 (9)   49 (41)  53 (44)  36 (30)
T           28 (23)  42 (35)  14 (12)  13 (11)  31 (26)
Consensus:  N        C/T      C/G      A/G      N

FY-3 (n=42):
Nucleotide  1        2        3        4        5
A           8 (19)   3 (7)    6 (14)   15 (36)  8 (19)
C           14 (33)  15 (36)  11 (26)  1 (2)    7 (17)
G           12 (28)  3 (7)    18 (42)  22 (51)  15 (35)
T           8 (19)   21 (50)  7 (18)   4 (10)   12 (29)
Consensus:  N        C/T      C/G      A/G      N

FY1679 + FY-3 (n=163):
Nucleotide  1        2        3        4        5
A           42 (26)  13 (8)   19 (12)  62 (38)  35 (21)
C           45 (28)  73 (45)  56 (34)  9 (6)    34 (21)
G           40 (25)  14 (9)   67 (41)  75 (46)  51 (31)
T           36 (22)  63 (39)  21 (13)  17 (10)  43 (26)
Consensus:  N        C/T      C/G      A/G      N
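
The percentages and the consensus line of Table 4 can be recomputed from the counts. The Python sketch below does this for the combined data set (FY1679 + FY-3, n=163). The source does not state how the consensus was derived; the rule used here (list every base present at more than one third of the sites, otherwise N) is an assumption chosen because it reproduces the combined consensus, and it is not claimed to be the authors' rule. The names `COUNTS`, `percentages` and `consensus` are likewise illustrative.

```python
# Counts for the combined data set (FY1679 + FY-3, n=163), positions 1..5,
# as given in Table 4.
COUNTS = {
    "A": [42, 13, 19, 62, 35],
    "C": [45, 73, 56, 9, 34],
    "G": [40, 14, 67, 75, 51],
    "T": [36, 63, 21, 17, 43],
}
BASES = "ACGT"


def percentages(counts):
    """Convert per-position counts into whole-number percentages."""
    positions = len(counts["A"])
    totals = [sum(counts[b][i] for b in BASES) for i in range(positions)]
    return {
        b: [round(100 * counts[b][i] / totals[i]) for i in range(positions)]
        for b in BASES
    }


def consensus(counts):
    """Assumed rule: report bases found at more than 1/3 of the sites, else N."""
    result = []
    for i in range(len(counts["A"])):
        total = sum(counts[b][i] for b in BASES)
        # Integer comparison avoids floating-point issues at exactly one third.
        frequent = [b for b in BASES if 3 * counts[b][i] > total]
        result.append("/".join(frequent) if frequent else "N")
    return result


if __name__ == "__main__":
    for base, row in percentages(COUNTS).items():
        print(base, row)
    print("Consensus:", " ".join(consensus(COUNTS)))
```

With a threshold of this kind, small data sets can yield a different call at borderline positions, so the sketch is best read as a check on the arithmetic rather than as a definition of the consensus.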

We claim:

1. A method for incorporating nucleic acid segments into cellular nucleic acid of a 
eukaryotic target cell, the method comprising the step of: 

delivering into the eukaryotic target cell a Mu transposition complex that comprises 
(i) MuA transposases and (ii) a transposon segment that comprises a pair of Mu end 
sequences recognised and bound by MuA transposase and an insert sequence between said 
Mu end sequences, under conditions that allow integration of the transposon segment into 
the cellular nucleic acid.

2. The method according to claim 1, wherein said Mu transposition complex is delivered 
into the target cell by electroporation. 

3. The method according to claim 1, wherein the nucleic acid segment is incorporated to a
random or almost random position of the cellular nucleic acid of the target cell. 

4. The method according to claim 1, wherein the nucleic acid segment is incorporated to a 
targeted position of the cellular nucleic acid of the target cell. 

5. The method according to claim 1, wherein the target cell is human, animal, plant, fungi
or yeast cell.

6. The method according to claim 5, wherein said animal cell is a mouse cell. 

7. The method according to claim 1, wherein said insert sequence comprises a marker, 
which is selectable in eukaryotic cells. 

8. The method according to claim 1, wherein a concentrated fraction of Mu transposition
complexes is delivered into the target cell.

9. The method according to claim 1 further comprising the step of incubating the target 
cells under conditions that promote transposition into the cellular nucleic acid. 

10. A method for forming an insertion mutant library from a pool of eukaryotic target cells,
the method comprising the steps of: 

a) delivering into the eukaryotic target cell a Mu transposition complex that comprises
(i) MuA transposases and (ii) a transposon segment that comprises a pair of Mu end
sequences recognised and bound by MuA transposase and an insert sequence between said
Mu end sequences, under conditions that allow integration of the transposon segment into
the cellular nucleic acid; and

b) screening for cells that comprise the selectable marker.

11. A kit for incorporating nucleic acid segments into cellular nucleic acid of a eukaryotic
target cell comprising a concentrated fraction of Mu transposition complexes with a 
transposon segment that comprises a marker, which is selectable in eukaryotic cells. 

(57) Abstract 



The present invention relates to genetic engineering and especially to the use of the DNA
transposition complex of bacteriophage Mu. In particular, the invention provides a gene
transfer system for eukaryotic cells, wherein in vitro assembled Mu transposition
complexes are introduced into a target cell, after which transposition into a cellular
nucleic acid occurs. The invention further provides a kit for producing insertional
mutations in the genomes of eukaryotic cells. The kit can be used, e.g., to generate
insertional mutant libraries.

SEQUENCE LISTING

<110> Finnzymes Oy 

<120> Method for delivering nucleic acid into eukaryotic 
genomes 

<140> 
<141> 

<160> 21 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 20 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 1 

gctctccccg tggaggtaat 

<210> 2 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 2 

ttccgtcaca ggtatttatt cggt 

<210> 3 
<211> 17 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 3 

atcagcggcc gcgatcc 

<210> 4 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 



<400> 4 

ggacgaggca agctaaacag 



<210> 5 
<211> 33 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
Oligonucleotide primer 

<400> 5 

ctaataccac tcacataggg cggccgcccg ggc 

<210> 6 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide primer 

<400> 6 

gatcgcccgg gcg 



<210> 7 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
Oligonucleotide primer

<400> 7 

ctaggcccgg gcg 



<210> 8 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide primer 

<400> 8 

aattgcccgg gcg 

<210> 9 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 9 

ctaataccac tcacataggg 



<210> 10 
<211> 20 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 10 

gggcggccgc ccgggcgatc 

<210> 11 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 11 

gggcggccgc ccgggcctag 



<210> 12 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide primer 

<400> 12 

gggcggccgc ccgggcaatt 



<210> 13 

<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide primer 

<400> 13 

ctgtcgattc gatactaacg 



<210> 14
<211> 27
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
Oligonucleotide primer

<400> 14

ctctagatga tcagcggccg cgatccg

<210> 15
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
Oligonucleotide primer

<400> 15

tgtcaaggag ggtattctgg

<210> 16
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
Oligonucleotide primer

<400> 16

ggtgacccgg cggggacgag gc

<210> 17
<211> 23
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
Oligonucleotide primer

<400> 17

gatccgtttt cgcatttatc gtg

<210> 18
<211> 28
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
Oligonucleotide primer

<400> 18

ggccgcatcg ataagcttgg gctgcagg

<210> 19 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence 
Oligonucleotide primer 

<400> 19 

acattgggtg gaaacattcc 

<210> 20 
<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide primer 

<400> 20 

ccaagttcgg gtgaaggc 

<210> 21 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide primer 

<400> 21 

ccccgggcga gtctagggcc gc 
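
Each entry in the listing above declares its length under <211> and gives the residues after <400>, so the two can be cross-checked. The Python sketch below is a minimal illustration under stated assumptions: the function name `parse_listing` and the simplified handling of the numeric identifiers are choices made for this example, not a complete parser for the listing format. The sample reproduces entries 1 to 3.

```python
import re

# Entries 1-3 of the listing, reproduced as sample input.
SAMPLE = """\
<210> 1
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:
Oligonucleotide primer
<400> 1
gctctccccg tggaggtaat

<210> 2
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:
Oligonucleotide primer
<400> 2
ttccgtcaca ggtatttatt cggt

<210> 3
<211> 17
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:
Oligonucleotide primer
<400> 3
atcagcggcc gcgatcc
"""

TAG = re.compile(r"^<(\d{3})>\s*(.*)$")


def parse_listing(text):
    """Collect id, declared length and residues for each <210> entry."""
    records = []
    current = None
    in_sequence = False  # True only for lines following a <400> tag
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag = TAG.match(line)
        if tag:
            code, value = tag.group(1), tag.group(2)
            in_sequence = False
            if code == "210":
                current = {"id": int(value), "length": None, "seq": ""}
                records.append(current)
            elif current is not None and code == "211":
                current["length"] = int(value)
            elif current is not None and code == "400":
                in_sequence = True
        elif in_sequence and current is not None:
            # Drop grouping spaces and any trailing position numbers.
            current["seq"] += re.sub(r"[^a-z]", "", line.lower())
    return records


if __name__ == "__main__":
    for rec in parse_listing(SAMPLE):
        status = "ok" if rec["length"] == len(rec["seq"]) else "MISMATCH"
        print(rec["id"], rec["length"], len(rec["seq"]), status)
```

Free-text continuation lines such as "Oligonucleotide primer" are ignored because residues are only collected after a <400> tag.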



